Figure 1: Size factors of initial oyster samples in the study, calculated using oyster gene count data.
Inspection of the size factors shows 5 samples with very low values; these were treated as outliers and removed:
Figure 2: Size factors of oyster samples remaining in the study after 5 outliers were removed, calculated using oyster gene count data.
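The size factors in Figures 1 and 2 can be illustrated with a minimal base-R sketch of the median-of-ratios estimate, assuming they were computed with the DESeq2 default. The toy matrix, the helper name `median_ratio_size_factors`, and the `low_cutoff` value are hypothetical and are not taken from this analysis.

```r
# Toy count matrix: 4 genes x 3 samples (hypothetical values)
toy_counts <- cbind(
  s1 = c(10, 20, 30, 40),
  s2 = c(20, 40, 60, 80),  # twice the sequencing depth of s1
  s3 = c(1, 2, 3, 4)       # one tenth the sequencing depth of s1
)
rownames(toy_counts) <- paste0("gene", 1:4)

# Hypothetical helper: median-of-ratios size factors
median_ratio_size_factors <- function(counts) {
  # log geometric mean of each gene across samples (-Inf if any count is 0)
  log_geo_means <- rowMeans(log(counts))
  apply(counts, 2, function(sample_counts) {
    # use only genes with a finite geometric mean and a nonzero count
    usable <- is.finite(log_geo_means) & sample_counts > 0
    exp(median((log(sample_counts) - log_geo_means)[usable]))
  })
}

sf <- median_ratio_size_factors(toy_counts)
print(sf)

# Flag samples with unusually low size factors (cutoff is an assumption)
low_cutoff <- 0.5
low_samples <- names(sf)[sf < low_cutoff]
print(low_samples)
```

In this toy example the size factors scale with sequencing depth, so the shallow sample `s3` is the one flagged. The cutoff actually used to identify the 5 removed samples is not stated here.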
Before filtering low-count genes, the dataset contained 38828 genes. After removing genes with low counts (those with a base mean less than 3), 21957 genes remain for all C. virginica samples.
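The low-count filter can be sketched as follows, assuming "base mean" is the mean normalized count of a gene across samples. The matrix `norm_counts` and its values are hypothetical; only the threshold of 3 comes from the text.

```r
# Hypothetical normalized counts: 5 genes x 4 samples
norm_counts <- rbind(
  geneA = c(0, 1, 0, 2),
  geneB = c(5, 8, 3, 4),
  geneC = c(2, 3, 4, 3),
  geneD = c(0, 0, 0, 1),
  geneE = c(50, 42, 61, 47)
)

base_mean <- rowMeans(norm_counts)
min_base_mean <- 3  # threshold stated in the text

# Keep genes whose base mean is not less than the threshold
filtered <- norm_counts[base_mean >= min_base_mean, , drop = FALSE]

cat("genes before:", nrow(norm_counts), " genes after:", nrow(filtered), "\n")
```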
# Use filter_counts_out (created above): low-count genes are filtered and outlier samples are removed
## Create the DESeq object
dds <- DESeqDataSetFromMatrix(countData = filter_counts_out,
                              colData = expDesign,
                              design = ~ treat)
dds <- DESeq(dds) # differential expression analysis based on the negative binomial (gamma-Poisson) distribution
vsd <- varianceStabilizingTransformation(dds, blind = TRUE) # estimate the dispersion trend and apply a variance stabilizing transformation
## Save the rlog-transformed counts for the DEG heatmap
rlog <- DESeq2::rlog(dds, blind = TRUE) # for use later on
## Save the dds, vsd, and rlog objects
save(dds, vsd, rlog, file = "Data/transformed_counts.RData")
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 1500
##
## adonis2(formula = pcaData[6:31] ~ pcaData$infect * pcaData$pCO2, permutations = 1500, method = "eu")
## Df SumOfSqs R2 F Pr(>F)
## pcaData$infect 1 994.1 0.04971 1.2514 0.1646
## pcaData$pCO2 1 787.4 0.03938 0.9912 0.4011
## pcaData$infect:pcaData$pCO2 1 738.0 0.03691 0.9290 0.4897
## Residual 22 17477.6 0.87400
## Total 25 19997.2 1.00000
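As a reading aid for the table above, the R2 and F columns can be recomputed from the printed sums of squares and degrees of freedom. This is plain arithmetic on the reported values, not a re-run of the permutation test; the variable names are illustrative.

```r
# Sums of squares and degrees of freedom copied from the adonis2 table above
ss <- c(infect = 994.1, pCO2 = 787.4, interaction = 738.0)
df <- c(infect = 1, pCO2 = 1, interaction = 1)
ss_resid <- 17477.6
df_resid <- 22
ss_total <- 19997.2

# R2 is the share of the total sum of squares attributed to each term
r2 <- ss / ss_total

# Pseudo-F is the term mean square divided by the residual mean square
pseudo_f <- (ss / df) / (ss_resid / df_resid)

print(round(r2, 5))
print(round(pseudo_f, 3))
```

The recomputed values agree with the table up to rounding of the printed sums of squares. Each term explains less than 5% of the variation, consistent with the non-significant p-values.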
Figure 3: PCA plot with each point being one oyster sample colored by treatment. Light blue represents the control treatment (no infection, pCO2 level 400), dark blue represents no infection but high pCO2, dark red represents samples infected with boring sponge at pCO2 2800, and orange represents samples infected with boring sponge at pCO2 400.
## log2 fold change (MLE): treat S 400 vs N 2800
## Wald test p-value: treat S 400 vs N 2800
## DataFrame with 21957 rows and 6 columns
## baseMean log2FoldChange lfcSE stat pvalue padj
## <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
## LOC111126949 3.458449 -0.9403532 0.815976 -1.1524281 0.249145 0.999808
## LOC111110729 0.485959 -0.5103647 1.471009 -0.3469488 0.728630 0.999808
## LOC111120752 1.699238 -0.2830392 0.849333 -0.3332488 0.738947 0.999808
## LOC111113860 5.950207 -0.0422788 0.599798 -0.0704884 0.943805 0.999808
## LOC111109550 0.331525 -0.9795688 2.472711 -0.3961518 0.691993 0.999808
## ... ... ... ... ... ... ...
## LOC111117112 0.334073 1.226655 2.15833 0.568335 0.569808 0.999808
## LOC111117113 0.144147 0.770423 3.80623 0.202411 0.839596 0.999808
## LOC111117122 0.114995 0.770418 4.24705 0.181400 0.856053 0.999808
## LOC111116908 0.267869 -0.577240 2.45874 -0.234770 0.814387 0.999808
## LOC111117715 0.718084 -0.458299 1.47476 -0.310762 0.755982 0.999808
##
## out of 21956 with nonzero total read count
## adjusted p-value < 0.1
## LFC > 0 (up) : 1, 0.0046%
## LFC < 0 (down) : 0, 0%
## outliers [1] : 31, 0.14%
## low counts [2] : 1, 0.0046%
## (mean count < 0)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
summary(res_con_sponge)
##
## out of 21956 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 0, 0%
## LFC < 0 (down) : 0, 0%
## outliers [1] : 31, 0.14%
## low counts [2] : 1, 0.0046%
## (mean count < 0)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
summary(res_con_pco2)
##
## out of 21956 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 0, 0%
## LFC < 0 (down) : 0, 0%
## outliers [1] : 31, 0.14%
## low counts [2] : 1, 0.0046%
## (mean count < 0)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
summary(res_trt)
##
## out of 21956 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 6, 0.027%
## LFC < 0 (down) : 6, 0.027%
## outliers [1] : 31, 0.14%
## low counts [2] : 11054, 50%
## (mean count < 3)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
summary(res_trt_pco2)
##
## out of 21956 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 0, 0%
## LFC < 0 (down) : 1, 0.0046%
## outliers [1] : 31, 0.14%
## low counts [2] : 1, 0.0046%
## (mean count < 0)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
summary(res_stn_pco2)
##
## out of 21956 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 1, 0.0046%
## LFC < 0 (down) : 0, 0%
## outliers [1] : 31, 0.14%
## low counts [2] : 1, 0.0046%
## (mean count < 0)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
summary(res_shn_pco2)
##
## out of 21956 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up) : 2, 0.0091%
## LFC < 0 (down) : 3, 0.014%
## outliers [1] : 31, 0.14%
## low counts [2] : 1, 0.0046%
## (mean count < 0)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
## [1] "N_400, S_2800"
Figure: Bar plot of significant DEGs. The y axis shows the number of genes; positive values denote up-regulated genes and negative values denote down-regulated genes.
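The counts behind a bar plot of this kind can be tallied roughly as follows. This is a sketch under assumptions: the two results tables are hypothetical data frames that only mimic the `log2FoldChange` and `padj` columns of a DESeq2 results object, and `count_degs` is an illustrative helper, not a function from the analysis.

```r
# Hypothetical results tables with the two columns needed for the tally
res_list <- list(
  contrast_1 = data.frame(
    log2FoldChange = c(1.2, -0.8, 2.1, -1.5, 0.3),
    padj           = c(0.01, 0.20, 0.03, 0.04, NA)
  ),
  contrast_2 = data.frame(
    log2FoldChange = c(-2.0, 0.5, -1.1),
    padj           = c(0.001, 0.60, 0.049)
  )
)
alpha <- 0.05  # adjusted p-value threshold used in the summaries above

# Hypothetical helper: count significant genes by direction;
# down-regulated genes are returned as negative values for plotting
count_degs <- function(res, alpha) {
  sig <- !is.na(res$padj) & res$padj < alpha
  c(up   = sum(sig & res$log2FoldChange > 0),
    down = -sum(sig & res$log2FoldChange < 0))
}

deg_counts <- sapply(res_list, count_degs, alpha = alpha)
print(deg_counts)
```

Genes with an `NA` adjusted p-value (for example those removed by independent filtering) are excluded from both tallies. The resulting matrix can be passed to a plotting function such as `barplot()`.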
Figure 4: Venn diagrams depicting the number of shared significantly up-regulated and down-regulated genes. Orange Venn diagrams depict up-regulated genes, while blue Venn diagrams depict down-regulated genes.
TODO (ANGELA): add all treatments with DEGs to the code below.
Figure 5: Left: four-way Venn diagram showing relationships among up-regulated genes between different treatment comparisons. Right: Venn diagram showing relationships among down-regulated genes between treatment comparisons with significantly down-regulated genes.
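The overlaps drawn in Venn diagrams like these reduce to set operations on gene ID vectors. A minimal base-R sketch, with hypothetical comparison names and gene IDs:

```r
# Hypothetical sets of significantly up-regulated gene IDs per comparison
up_sets <- list(
  comparison_1 = c("geneA", "geneB", "geneC"),
  comparison_2 = c("geneB", "geneC", "geneD"),
  comparison_3 = c("geneC", "geneE")
)

# Genes shared by every comparison (the central region of the diagram)
shared_all <- Reduce(intersect, up_sets)

# Genes found only in comparison_1 (its non-overlapping region)
only_1 <- setdiff(up_sets$comparison_1,
                  union(up_sets$comparison_2, up_sets$comparison_3))

print(shared_all)
print(only_1)
```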
ANOVA test on plasticity data with 2 principal components (testing whether dist differs significantly among levels of treat)
## Df Sum Sq Mean Sq F value Pr(>F)
## treat 2 345.3 172.66 1.773 0.2
## Residuals 17 1655.1 97.36
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dist ~ treat, data = ge_plast_2)
##
## $treat
## diff lwr upr p adj
## S_2800-N_2800 9.916192 -3.613859 23.446243 0.1747530
## S_400-N_2800 5.485382 -8.597142 19.567905 0.5872639
## S_400-S_2800 -4.430810 -18.513334 9.651713 0.7037767
ANOVA test on plasticity data with all principal components (testing whether dist differs significantly among levels of treat)
## Df Sum Sq Mean Sq F value Pr(>F)
## treat 2 116.6 58.28 2.209 0.14
## Residuals 17 448.5 26.38
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dist ~ treat, data = ge_plast_all)
##
## $treat
## diff lwr upr p adj
## S_2800-N_2800 5.733443 -1.310008 12.776894 0.1221031
## S_400-N_2800 3.465748 -3.865309 10.796804 0.4619881
## S_400-S_2800 -2.267696 -9.598752 5.063361 0.7119158
ANOVA test on plasticity data with all principal components (testing whether dist differs significantly with pCO2 * infect)
## Df Sum Sq Mean Sq F value Pr(>F)
## pCO2 1 1.5 1.51 0.057 0.8140
## infect 1 115.1 115.05 4.361 0.0521 .
## Residuals 17 448.5 26.38
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Call:
## aov(formula = dist ~ pCO2 * infect, data = ge_plast_all)
##
## Terms:
## pCO2 infect Residuals
## Sum of Squares 1.5071 115.0533 448.5309
## Deg. of Freedom 1 1 17
##
## Residual standard error: 5.136552
## 1 out of 4 effects not estimable
## Estimated effects may be unbalanced
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dist ~ pCO2 * infect, data = ge_plast_all)
##
## $pCO2
## diff lwr upr p adj
## 2800-400 -0.599026 -5.887027 4.688975 0.8139628
##
## $infect
## diff lwr upr p adj
## S-N 4.410341 -0.6702065 9.490888 0.0846085
ANOVA test on plasticity data with 2 principal components (testing whether dist differs significantly with pCO2 * infect)
## Df Sum Sq Mean Sq F value Pr(>F)
## pCO2 1 1.2 1.2 0.012 0.9141
## infect 1 344.2 344.2 3.535 0.0773 .
## Residuals 17 1655.1 97.4
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Call:
## aov(formula = dist ~ pCO2 * infect, data = ge_plast_2)
##
## Terms:
## pCO2 infect Residuals
## Sum of Squares 1.1677 344.1580 1655.0846
## Deg. of Freedom 1 1 17
##
## Residual standard error: 9.867012
## 1 out of 4 effects not estimable
## Estimated effects may be unbalanced
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dist ~ pCO2 * infect, data = ge_plast_2)
##
## $pCO2
## diff lwr upr p adj
## 2800-400 -0.5272857 -10.68522 9.63065 0.9140742
##
## $infect
## diff lwr upr p adj
## S-N 7.62784 -2.131589 17.38727 0.1175017
ANOVA test on plasticity data with all principal components (testing whether dist differs significantly among levels of treat)
## Df Sum Sq Mean Sq F value Pr(>F)
## treat 2 116.6 58.28 2.209 0.14
## Residuals 17 448.5 26.38
## Call:
## aov(formula = dist ~ treat, data = ge_plast_all)
##
## Terms:
## treat Residuals
## Sum of Squares 116.5604 448.5309
## Deg. of Freedom 2 17
##
## Residual standard error: 5.136552
## Estimated effects may be unbalanced
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = dist ~ treat, data = ge_plast_all)
##
## $treat
## diff lwr upr p adj
## S_2800-N_2800 5.733443 -1.310008 12.776894 0.1221031
## S_400-N_2800 3.465748 -3.865309 10.796804 0.4619881
## S_400-S_2800 -2.267696 -9.598752 5.063361 0.7119158
Figure 6: Assessment of plasticity (A-D), calculated as the average of the distances of each sample to the mean eigenvalue of the control (N_400). The control was omitted from this analysis. A) and C) show the analysis of all principal components, while B) and D) show the analysis of 2 principal components. A) and B) are box plots of plasticity with each individual sample shown as a point, while C) and D) show the mean and standard error.
Infected oyster samples in the 2800 pCO2 treatment exhibited the highest plasticity (Figure __), but the ANOVA test with all principal components indicates that the difference among treatments was not statistically significant (F = 2.209, p = 0.14). Oysters not infected with sponge in the 2800 pCO2 treatment had the lowest plasticity (Figure _).
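The plasticity calculation and test can be sketched roughly as follows, assuming plasticity is the Euclidean distance from each sample to the control (N_400) centroid in principal component space. The PC scores are simulated, and `toy_plast` and the other object names are hypothetical; only the `dist ~ treat` formula mirrors the output above.

```r
set.seed(1)  # toy data only, so the sketch is reproducible

# Hypothetical PC scores: 20 samples x 3 principal components
treat <- factor(rep(c("N_400", "N_2800", "S_400", "S_2800"), each = 5))
pc_scores <- matrix(rnorm(length(treat) * 3), ncol = 3,
                    dimnames = list(NULL, paste0("PC", 1:3)))

# Mean position of the control samples on each principal component
control_centroid <- colMeans(pc_scores[treat == "N_400", , drop = FALSE])

# Euclidean distance of every sample to the control centroid
dist_to_control <- sqrt(rowSums(sweep(pc_scores, 2, control_centroid)^2))

# Drop the control itself before testing, as in the analysis above
toy_plast <- data.frame(treat = treat, dist = dist_to_control)
toy_plast <- droplevels(toy_plast[toy_plast$treat != "N_400", ])

fit <- aov(dist ~ treat, data = toy_plast)
print(summary(fit))
print(TukeyHSD(fit))
```

Because the control is omitted, only three of the four pCO2 by infection combinations remain, which is why the `pCO2 * infect` models above report "1 out of 4 effects not estimable": the interaction term cannot be estimated.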
All significant GO terms identified from DEGs:
BP_DEG_tree:
CC_DEG_tree:
MF_DEG_tree:
Trying Dan’s way…
## 19086 19103 19264 19370 19516 19642 19158
## LOC111126949 4.042424 3.149557 3.557635 2.946954 4.077215 3.084719 2.446078
## LOC111110729 3.075687 3.149557 2.446078 2.946954 2.446078 2.446078 2.932499
## LOC111120752 3.813375 3.431561 3.241539 3.305232 3.371873 3.084719 3.405743
## LOC111113860 3.933769 3.149557 4.068161 4.374763 2.986637 3.694497 3.991978
## LOC111109550 2.446078 2.446078 3.012069 2.446078 2.446078 2.446078 2.446078
## LOC111109753 4.141743 3.149557 3.983449 3.150954 4.216418 2.446078 2.446078
## 19254 19274 19304 19467 19472 19549
## LOC111126949 2.446078 4.194953 3.574840 3.743548 3.611405 4.109706
## LOC111110729 3.082038 2.446078 2.446078 2.446078 2.446078 3.156318
## LOC111120752 3.082038 2.446078 3.429584 3.211660 3.194863 2.446078
## LOC111113860 3.947781 3.556237 3.254143 3.929106 4.255637 3.311708
## LOC111109550 2.446078 3.359705 2.446078 2.990563 2.446078 2.446078
## LOC111109753 3.689479 2.446078 2.446078 3.211660 2.446078 3.311708
## Flagging genes and samples with too many missing values...
## ..step 1
## [1] TRUE
One sample (19611) was flagged as an outlier and removed.
## pickSoftThreshold: will use block size 3430.
## pickSoftThreshold: calculating connectivity for given powers...
## ..working on genes 1 through 3430 of 13043
## ..working on genes 3431 through 6860 of 13043
## ..working on genes 6861 through 10290 of 13043
## ..working on genes 10291 through 13043 of 13043
## Power SFT.R.sq slope truncated.R.sq mean.k. median.k. max.k.
## 1 1 0.0740 18.50 0.978 6540.00 6540.00 6820.0
## 2 2 0.0147 -4.00 0.971 3450.00 3450.00 3810.0
## 3 3 0.3240 -10.70 0.959 1910.00 1900.00 2290.0
## 4 4 0.4290 -8.48 0.943 1090.00 1080.00 1450.0
## 5 5 0.5480 -6.54 0.929 648.00 634.00 967.0
## 6 6 0.7020 -6.03 0.949 396.00 383.00 688.0
## 7 7 0.7930 -5.48 0.964 249.00 237.00 516.0
## 8 8 0.8600 -5.03 0.977 161.00 150.00 399.0
## 9 9 0.8980 -4.62 0.981 106.00 97.30 315.0
## 10 10 0.9290 -4.26 0.989 71.20 64.10 255.0
## 11 12 0.9500 -3.75 0.991 34.10 29.10 174.0
## 12 14 0.9490 -3.42 0.986 17.40 14.00 125.0
## 13 16 0.9570 -3.09 0.990 9.43 7.07 92.6
## 14 18 0.9440 -2.91 0.985 5.37 3.71 70.7
## 15 20 0.9480 -2.71 0.989 3.19 2.02 55.1
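One common way to read a table like this is to take the lowest power whose signed scale-free fit exceeds a cutoff. The sketch below applies that rule to the values printed above; the 0.90 cutoff is an assumption, and the power actually used in this analysis is not stated here.

```r
# Power, SFT.R.sq, and slope copied from the pickSoftThreshold table above
sft_table <- data.frame(
  Power    = c(1:10, 12, 14, 16, 18, 20),
  SFT.R.sq = c(0.0740, 0.0147, 0.3240, 0.4290, 0.5480, 0.7020, 0.7930,
               0.8600, 0.8980, 0.9290, 0.9500, 0.9490, 0.9570, 0.9440, 0.9480),
  slope    = c(18.50, -4.00, -10.70, -8.48, -6.54, -6.03, -5.48,
               -5.03, -4.62, -4.26, -3.75, -3.42, -3.09, -2.91, -2.71)
)

# Signed fit: a scale-free topology should have a negative slope
signed_r2 <- -sign(sft_table$slope) * sft_table$SFT.R.sq

r2_cutoff <- 0.90  # assumed cutoff, not taken from the analysis
candidate_power <- min(sft_table$Power[signed_r2 >= r2_cutoff])
print(candidate_power)
```

The mean connectivity column is worth checking alongside the fit, since it drops steeply as the power increases.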
## mergeCloseModules: Merging modules whose distance is less than 0.4
## multiSetMEs: Calculating module MEs.
## Working on set 1 ...
## moduleEigengenes: Calculating 43 module eigengenes in given set.
## multiSetMEs: Calculating module MEs.
## Working on set 1 ...
## moduleEigengenes: Calculating 27 module eigengenes in given set.
## multiSetMEs: Calculating module MEs.
## Working on set 1 ...
## moduleEigengenes: Calculating 24 module eigengenes in given set.
## multiSetMEs: Calculating module MEs.
## Working on set 1 ...
## moduleEigengenes: Calculating 23 module eigengenes in given set.
## multiSetMEs: Calculating module MEs.
## Working on set 1 ...
## moduleEigengenes: Calculating 22 module eigengenes in given set.
## multiSetMEs: Calculating module MEs.
## Working on set 1 ...
## moduleEigengenes: Calculating 21 module eigengenes in given set.
## Calculating new MEs...
## multiSetMEs: Calculating module MEs.
## Working on set 1 ...
## moduleEigengenes: Calculating 21 module eigengenes in given set.
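The merging step logged above groups modules whose eigengenes are highly correlated. The idea can be illustrated in base R by clustering eigengenes on the dissimilarity 1 - correlation and cutting the tree at 0.4. This is a simplified single-pass sketch with constructed eigengenes, not the iterative WGCNA procedure, and all object names are hypothetical.

```r
# Constructed eigengenes for 12 samples (hypothetical):
# ME1 is a linear trend, ME3 a centered quadratic trend (uncorrelated with ME1),
# and ME2 is ME1 plus a small amount of ME3, so it is nearly identical to ME1
x <- 1:12
lin  <- x - mean(x)
quad <- (x - mean(x))^2
quad <- quad - mean(quad)
MEs <- cbind(ME1 = lin, ME2 = lin + 0.1 * quad, ME3 = quad)

# Dissimilarity between module eigengenes
me_diss <- as.dist(1 - cor(MEs))

# Average-linkage clustering, then cut at the merge height from the log above
me_tree <- hclust(me_diss, method = "average")
merge_groups <- cutree(me_tree, h = 0.4)
print(merge_groups)
```

Here ME1 and ME2 fall in the same group and would be merged, while ME3 stays separate. In the log above the merge is repeated with recalculated eigengenes until no further modules can be merged, which is why the module count falls from 43 to 21 over several rounds.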
Figure: Cluster Dendrogram
Figure SXX. All identified WGCNA modules correlated against significant traits with R2 and p values.
Figure SXX. Heatmap and barplots of samples for all significant traits identified within each WGCNA module.
Session information from the last full knit of Rmarkdown on 07 February 2023.
## R version 4.2.0 (2022-04-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] cowplot_1.1.1 DEGreport_1.33.1
## [3] pheatmap_1.0.12 RColorBrewer_1.1-3
## [5] gplots_3.1.3 gridExtra_2.3
## [7] WGCNA_1.71 fastcluster_1.2.3
## [9] dynamicTreeCut_1.63-1 flashClust_1.01-2
## [11] ggvenn_0.1.9 adegenet_2.1.8
## [13] ade4_1.7-19 ggrepel_0.9.1
## [15] pdftools_3.3.1 ggpubr_0.4.0
## [17] data.table_1.14.2 vegan_2.6-2
## [19] lattice_0.20-45 permute_0.9-7
## [21] DESeq2_1.36.0 SummarizedExperiment_1.26.1
## [23] Biobase_2.56.0 MatrixGenerics_1.8.1
## [25] matrixStats_0.62.0 GenomicRanges_1.48.0
## [27] GenomeInfoDb_1.32.4 IRanges_2.30.1
## [29] S4Vectors_0.34.0 BiocGenerics_0.42.0
## [31] plotly_4.10.0 forcats_0.5.2
## [33] stringr_1.4.1 dplyr_1.0.10
## [35] purrr_0.3.5 readr_2.1.3
## [37] tidyr_1.2.1 tibble_3.1.8
## [39] ggplot2_3.4.0 tidyverse_1.3.2
## [41] knitr_1.40
##
## loaded via a namespace (and not attached):
## [1] utf8_1.2.2 tidyselect_1.2.0
## [3] RSQLite_2.2.18 AnnotationDbi_1.58.0
## [5] htmlwidgets_1.5.4 BiocParallel_1.30.4
## [7] munsell_0.5.0 codetools_0.2-18
## [9] ragg_1.2.3 preprocessCore_1.58.0
## [11] interp_1.1-3 withr_2.5.0
## [13] colorspace_2.0-3 highr_0.9
## [15] rstudioapi_0.14 ggsignif_0.6.3
## [17] labeling_0.4.2 lasso2_1.2-22
## [19] GenomeInfoDbData_1.2.8 mnormt_2.1.1
## [21] bit64_4.0.5 farver_2.1.1
## [23] vctrs_0.5.1 generics_0.1.3
## [25] xfun_0.33 qpdf_1.3.0
## [27] R6_2.5.1 doParallel_1.0.17
## [29] clue_0.3-61 locfit_1.5-9.6
## [31] reshape_0.8.9 bitops_1.0-7
## [33] cachem_1.0.6 DelayedArray_0.22.0
## [35] assertthat_0.2.1 vroom_1.6.0
## [37] promises_1.2.0.1 scales_1.2.1
## [39] nnet_7.3-18 googlesheets4_1.0.1
## [41] gtable_0.3.1 rlang_1.0.6
## [43] genefilter_1.78.0 systemfonts_1.0.4
## [45] GlobalOptions_0.1.2 splines_4.2.0
## [47] rstatix_0.7.0 lazyeval_0.2.2
## [49] gargle_1.2.1 impute_1.70.0
## [51] broom_1.0.1 checkmate_2.1.0
## [53] yaml_2.3.5 reshape2_1.4.4
## [55] abind_1.4-5 modelr_0.1.9
## [57] backports_1.4.1 httpuv_1.6.6
## [59] Hmisc_4.7-1 tools_4.2.0
## [61] psych_2.2.9 logging_0.10-108
## [63] ellipsis_0.3.2 jquerylib_0.1.4
## [65] ggdendro_0.1.23 Rcpp_1.0.9
## [67] plyr_1.8.8 base64enc_0.1-3
## [69] zlibbioc_1.42.0 RCurl_1.98-1.9
## [71] rpart_4.1.16 deldir_1.0-6
## [73] GetoptLong_1.0.5 haven_2.5.1
## [75] cluster_2.1.4 fs_1.5.2
## [77] magrittr_2.0.3 circlize_0.4.15
## [79] reprex_2.0.2 googledrive_2.0.0
## [81] hms_1.1.2 mime_0.12
## [83] evaluate_0.17 xtable_1.8-4
## [85] XML_3.99-0.11 jpeg_0.1-9
## [87] readxl_1.4.1 shape_1.4.6
## [89] compiler_4.2.0 KernSmooth_2.23-20
## [91] crayon_1.5.2 htmltools_0.5.3
## [93] mgcv_1.8-40 later_1.3.0
## [95] tzdb_0.3.0 Formula_1.2-4
## [97] geneplotter_1.74.0 lubridate_1.8.0
## [99] DBI_1.1.3 ComplexHeatmap_2.12.1
## [101] dbplyr_2.2.1 MASS_7.3-58.1
## [103] Matrix_1.5-1 car_3.1-0
## [105] cli_3.4.1 parallel_4.2.0
## [107] igraph_1.3.5 pkgconfig_2.0.3
## [109] foreign_0.8-83 xml2_1.3.3
## [111] foreach_1.5.2 annotate_1.74.0
## [113] bslib_0.4.0 XVector_0.36.0
## [115] rvest_1.0.3 digest_0.6.29
## [117] ConsensusClusterPlus_1.60.0 Biostrings_2.64.1
## [119] rmarkdown_2.17 cellranger_1.1.0
## [121] htmlTable_2.4.1 edgeR_3.38.4
## [123] gtools_3.9.3 shiny_1.7.2
## [125] rjson_0.2.21 lifecycle_1.0.3
## [127] nlme_3.1-160 jsonlite_1.8.2
## [129] carData_3.0-5 seqinr_4.2-16
## [131] limma_3.52.4 viridisLite_0.4.1
## [133] askpass_1.1 fansi_1.0.3
## [135] pillar_1.8.1 KEGGREST_1.36.3
## [137] fastmap_1.1.0 httr_1.4.4
## [139] survival_3.4-0 GO.db_3.15.0
## [141] glue_1.6.2 png_0.1-7
## [143] iterators_1.0.14 bit_4.0.4
## [145] stringi_1.7.8 sass_0.4.2
## [147] blob_1.2.3 textshaping_0.3.6
## [149] caTools_1.18.2 latticeExtra_0.6-30
## [151] memoise_2.0.1 ape_5.6-2